class: center, middle, inverse, title-slide .title[ # Lecture 2: R Markdown, Version Control with Git(Hub), and Other Productivity Tools ] .author[ ### James Sears*
AFRE 891/991 FS 25
Michigan State University ] .date[ ### .small[
*Parts of these slides are adapted from
“Advanced Data Analytics”
by Nick Hagerty and
“Data Science for Economists”
by Grant McDermott.] ] --- <style type="text/css"> # CSS for including pauses in printed PDF output (see bottom of lecture) @media print { .has-continuation { display: block !important; } } .remark-code-line { font-size: 95%; } .small { font-size: 75%; } .scroll-output-full { height: 90%; overflow-y: scroll; } .scroll-output-75 { height: 75%; overflow-y: scroll; } </style> # Table of Contents 1. [Prologue](#prologue) 1. [R Markdown](#markdown) 1. [Version Control](#control) 1. [GitHub Desktop](#desktop) 1. [Other Tips and Productivity Tools](#tips) 1. [Not Covered: Git(Hub) + RStudio](#gitr) 1. [Not Covered: Troubleshooting Git Credential Issues in RStudio](#gitcred) --- class: inverse, middle name: prologue # Prologue <!-- software installations, checking registrations--> --- # Prologue Before we dive in, let's double check that we all have
Installed [.hi-orange[R]](https://www.r-project.org/).
Installed [.hi-orange[RStudio]](https://www.rstudio.com/products/rstudio/download/preview/).
Signed up for an account on [.hi-orange[Github]](https://github.com/)
Installed [.hi-orange[Git]](https://happygitwithr.com/install-git) and [.hi-orange[Github Desktop]](https://desktop.github.com/)
Log into your Github account on Github Desktop --- class: inverse, middle name: markdown # R Markdown --- #R Markdown Before we dive into version control, let's chat about .hi-medgrn[R Markdown]. -- R Markdown is a document type that allows for integration of R code and output into a Markdown document. .hi-blue[Resources:] - Website: [.hi-orange[rmarkdown.rstudio.com]](https://rmarkdown.rstudio.com) - [.hi-orange[R Markdown Cheatsheet]](https://github.com/rstudio/cheatsheets/raw/main/rmarkdown-2.0.pdf) - Book: [.hi-orange[R Markdown: The Definitive Guide]](https://bookdown.org/yihui/rmarkdown) (Yihui Xie, JJ Allaire, and Garrett Grolemund) --- #R Markdown Before we dive into version control, let's chat about .hi-medgrn[R Markdown]. R Markdown is a document type that allows for integration of R code and output into a Markdown document. .hi-pink[Other points:] - We'll be completing assignments using R Markdown. - FWIW, my lecture slides and notes are all written in R Markdown too. (E.g. This slide deck is built using the [.hi-orange[xaringan]](https://github.com/yihui/xaringan/wiki) package with the metropolis theme.) --- # R Markdown: Getting Started
Installed [R](https://www.r-project.org/).
Installed [RStudio](https://www.rstudio.com/products/rstudio/download/preview/). --
Add the `rmarkdown` package ``` r install.packages("rmarkdown") ``` --
Install LaTeX * If just for this, can use [.hi-orange[TinyTex]](https://yihui.name/tinytex/) ``` r # Install only if you don't have LaTeX already install.packages("tinytex") tinytex::install_tinytex() ``` --- # R Markdown: Creating a New .Rmd File <img src="data:image/png;base64,#images/rmd_add.png" width = "100%"/> --- # R Markdown: Creating a New .Rmd File <img src="data:image/png;base64,#images/rmd_add2.png" width = "100%"/> --- # R Markdown: Creating a New .Rmd File <img src="data:image/png;base64,#images/rmd_add3.png" width = "120%"/> --- # R Markdown Components R Markdown combines 1. .hi-purple[Markdown:] lightweight markup language 1. .hi-pink[LaTeX:] typesetting for math 1. .hi-medgrn[R:] include code and generate output -- <br> <br> Let's do some practice: .hi-slate[open a new .Rmd file] and try adding content as we go --- # Markdown .hi-purple[Markdown] allows for formatting text in a lightweight way I highly recommend the handy [.hi-orange[Markdown Guide]](https://www.markdownguide.org/) for more details --- # Markdown: Heading .hi-blue[Headings] emphasize text and add chunks to your script Largest headingwith one leading \# (slide title above) ## Second Largest (\#\#) ### Third Largest (\#\#\#) #### Getting Smaller... (\#\#\#\#) Normal Text for comparison --- # Markdown: Text Format **Bold text** with \*\*your text\*\* *Italicize* with \*single asterisks\* Add `code text` with grave accents (the back tick symbol) * ` * The other output of the tilde key `~` on keyboard End a line with two spaces to start a new paragraph * or leave a line space between sentences Can also start a new line with backslash (\\) --- # Markdown: Text Format Add superscripts<sup>2</sup> with ^carets^ Add ~~strikethroughs~~ with \~\~double tildes\~\~ Add a line break (horizontal rule) *** with \*\*\* --- # Markdown: Text Format Draw .hi-pink[tables] using | and - .center[ ``` | Col A | Col B | Col C| |-------|-------|------| | This | is | a | | Table | | wow | ``` ] --- # Markdown: Text Format Draw .hi-pink[tables] using | and - | Col A | Col B | Col C| |-------|-------|------| | This | is | a | | Table | | wow | --- # Markdown: Text Format You can adjust the .hi-medgrn[alignment] of table text by adding `:`'s in the second row: - `:----` for left-aligned - `:---:` for center-aligned - `----:` for left-aligned .center[ ``` | Column A | Column B | Column C| |:--- |:---:| ---:| | Col A | is | left-aligned | | Col B | is | center-aligned | | Col C | is | right-aligned | ``` ] --- # Markdown: Text Format You can adjust the .hi-medgrn[alignment] of table text by adding `:`'s in the second row: - `:----` for left-aligned - `:---:` for center-aligned - `----:` for left-aligned | Column A | Column B | Column C| |:--- |:---:| ---:| | Col A | is | left-aligned | | Col B | is | center-aligned | | Col C | is | right-aligned | --- # Markdown: Lists Add an .hi-purple[ordered list] with .hi-purple[1.] 1. First Item 1. Second Item 1. No need to change the number - keep using 1. It will automatically update. -- Add an .hi-medgrn[unordered list] with .hi-medgrn[\* or \-] * A thing * Another related thing - Indent to nest 1. Can mix ordered and unordered --- # Markdown: Inputs Add a [link](https://www.markdownguide.org/cheat-sheet/) with \[\]() * \[text label\](URL) * Add direct link with <link> <https://www.markdownguide.org> -- Add an image with !\[\]() * !\[alt text](URL) .hi-medgrn[practice] by adding `images/smile.png`:  --- # Markdown: LaTeX Another advantage of Markdown is that it integrates  functionality for typesetting math. -- Add an .hi-purple[inline equation] with $TeX$ `\(Var(X) = \sum\limits_{i=1}^n \frac{(x_i - \bar{x})^2}{n} ~~~~ ~~~ Y_{it} = \beta_0 + \beta_1 X_{it} + \epsilon_{it}\)` -- Add multiple rows of LaTeX with $$ LaTeX lines here $$ Use the [.hi-orange[standard LaTeX commands]](https://kapeli.com/cheat_sheets/LaTeX_Math_Symbols.docset/Contents/Resources/Documents/index) for symbols/characters --- # Rmd: R Code R code is primarily executed with .hi-blue[code chunks] -- Add a chunk with * `Cmd + Option + I (Ctrl + Alt + I on PC)` * The `Insert` button in the UI * Manually type  --- # Rmd: Code Chunks <img src="data:image/png;base64,#images/chunk.png" width = "120%"/> .hi-blue[Code chunks] allow us to add as many lines of code as we want * Output will appear underneath after executing the full chunk * Can customize whether it runs, how output is displayed * Can run manually * Line by line with `Cmd/Ctrl + Enter` * Entire chunk with `Run Entire Chunk` button --- # Code Chunk Options You can .hi-blue[add chunk options] in brackets after `r` and separated by commas. Some commonly-used options include: * .hi-slate[Chunk label] (`ex_chunk`) * `include = FALSE` will run the chunk but hide it from the final document * `eval = FALSE` will display code without evaluating it * `results = 'hide'` runs code but hides output from the final document <img src="data:image/png;base64,#images/chunk_opts.png" width = "80%"/> --- # Code Chunk Options You can also .hi-blue[change the color] of code chunks and/or output in the rendered document through code chunk options<sup>1</sup> * `class.source` to change the .hi-medgrn[code chunk] * `output.source` to change the .hi-red[output] -- .pull-left[ .hi-slate[default] <img src="data:image/png;base64,#images/class_def.png" width = "80%"/> ] .pull-right[ .hi-slate[`bg-primary`] <img src="data:image/png;base64,#images/class_primary.png" width = "80%"/> ] .footnote[<sup>1</sup> We'll chat more about this when we get to web scraping, but what's actually happening here is that we're harnessing some built-in CSS classes to change the backgrounds. This also means that, using CSS, you can define custom classes and format things however you'd like.] --- # Code Chunk Options You can also .hi-blue[change the color] of code chunks and/or output in the rendered document through code chunk options<sup>1</sup> * `class.source` to change the .hi-medgrn[code chunk] * `output.source` to change the .hi-red[output] .pull-left[ .hi-slate[`bg-success`] <img src="data:image/png;base64,#images/class_success.png" width = "80%"/> ] .pull-right[ .hi-slate[`bg-info`] <img src="data:image/png;base64,#images/class_info.png" width = "80%"/> ] .footnote[<sup>1</sup> We'll chat more about this when we get to web scraping, but what's actually happening here is that we're harnessing some built-in CSS classes to change the backgrounds. This also means that, using CSS, you can define custom classes and format things however you'd like.] --- # Code Chunk Options You can also .hi-blue[change the color] of code chunks and/or output in the rendered document through code chunk options<sup>1</sup> * `class.source` to change the .hi-medgrn[code chunk] * `output.source` to change the .hi-red[output] .pull-left[ .hi-slate[`bg-warning`] <img src="data:image/png;base64,#images/class_warning.png" width = "80%"/> ] .pull-right[ .hi-slate[`bg-danger`] <img src="data:image/png;base64,#images/class_danger.png" width = "80%"/> ] .footnote[<sup>1</sup> We'll chat more about this when we get to web scraping, but what's actually happening here is that we're harnessing some built-in CSS classes to change the backgrounds. This also means that, using CSS, you can define custom classes and format things however you'd like.] --- # Rmd: Inline Code You can call R objects from earlier chunks .hi-medgrn[inline] with  r  ``` r four = 2+2 ``` This can output in line with text: 2 + 2 = 4 --- class: inverse, middle # R Markdown File Organization --- # 1. Header .pull-left[ RStudio automatically builds the R Markdown file from a template, which begins with a .hi-medgrn[header] * Title * Author * Date * Output Format * Main options<sup>1</sup>: HTML (`html_document`), PDF (`pdf_document`), LaTeX (`latex_document`), or Word (`word_document`)<sup>2</sup> ] .pull-right[ <img src="data:image/png;base64,#images/header.png" width = "120%"/> .font80[1: See [.hi-orange[CH 3 of "R Markdown: The Definitive Guide" for more on how to customize output formats]](https://bookdown.org/yihui/rmarkdown/documents.html)] .font80[2: For .hi-medgrn[better formatted Word output] with greater customisability, use the **officedown** package's [`rdox_document`](https://davidgohel.github.io/officedown/reference/rdocx_document.html) format.] ] --- # 2. R Setup By default, RStudio adds a .hi-blue[setup] code chunk next. <img src="data:image/png;base64,#images/r_setup.png" width = "120%"/> * Can set global options * Useful as your preamble * For [.hi-orange[R Notebooks]](https://bookdown.org/yihui/rmarkdown/notebook.html), this will automatically be run and is the only place where you can change your working directory --- # 3. Contents From here on you can build the report/notebook as needed for the task. * Add any writing and outside graphics or [.hi-orange[bibTeX citations]](https://bookdown.org/yihui/rmarkdown-cookbook/bibliography.html) * Add code chunks to carry out desired analysis * Employ sections and formatting to structure the document as desired --- # Compiling/Knitting When you are ready to compile your final document, use the `Knit` button or `Ctrl/Cmd + Shift + K` <img src="data:image/png;base64,#images/rmd_knit.png" width = "100%"/> --- # R Markdown: Knit to Compile Output (HTML, PDF) <img src="data:image/png;base64,#images/rmd_knit2.png" width = "120%"/> --- class: inverse, middle # Markdown Practice! --- # Markdown Practice 1. Create a new R Markdown file named "R-Markdown-Ex.Rmd" 1. In the setup chunk, load the **dslabs** and **tidyverse** packages * Use the `data()` function to read in the `divorce_margarine` dataset 1. Add a header labeled "Correlation vs. Causation" and a text explanation below for why we often want to differentiate between the two 1. Add a code chunk with the label `plot` * Type the following code: ``` ggplot(divorce_margarine) + geom_point(aes(x = margarine_consumption_per_capita, y = divorce_rate_maine)) + labs(title = "Relationship between Margarine Consumption and Divorce Rates in Maine", subtitle = "2000-2009", x = "Margarine Consumption per Capita", y = "Divorce Rate") ``` 1. Knit and save a PDF/HTML copy of the file to the "output" folder --- class: inverse, middle name: control # Version Control <!-- basics of version control, why do it, different version --> --- # Why Use Version Control .center[ <img src = "data:image/png;base64,#images/phd052810s.gif" height = "100%"/> ] --- # Goals of Version Control While building project folders with the above naming conventions is *fun*, a good .hi-medgrn[version control system] can solve this problem. * Save each set of changes sequentially * Keep track of different versions of a file * Merge changes from multiple versions/sources --- # Git(Hub) Solves this Problem ### Git - .hi-medgrn[Git] is a .hi-medgrn[distributed version control system] - Each team member has a .hi-blue[local copy] of files on their computer - Imagine if Dropbox and the "Track changes" feature in MS Word had a baby. Git would be that baby. - In fact, it's even better than that because Git is optimised for the things that economists and data scientists spend a lot of time working on (e.g. code). - There is a learning curve, but I promise you it's worth it. --- # Git(Hub) Solves this Problem ### GitHub - It's important to realise that .hi-medgrn[Git] and .hi-purple[GitHub] are distinct things. - .hi-purple[GitHub] is an .hi-purple[online hosting platform] that provides an array of services built on top of the .hi-medgrn[Git] system. (Similar platforms include Bitbucket and GitLab.) - Just like we don't *need* .hi-red[Rstudio] to run .hi-pink[R] code, we don't *need* .hi-purple[GitHub] to use .hi-medgrn[Git]... but it will make our lives so much easier. --- # Git(Hub) for Scientific Research .hi-slate[From software development...] - .hi-medgrn[Git] and .hi-purple[GitHub]'s role in global software development is not in question. - There's a high probability that your favorite app, program or package is built using Git-based tools. (RStudio is a case in point.) .hi-slate[... to scientific research] - Benefits of VC and collaboration tools aside, Git(Hub) helps to operationalise the ideals of open science and reproducibility.<sup>2</sup> - Journals have increasingly strict requirements regarding reproducibility and data access. GH makes this easy (DOI integration, off-the-shelf licenses, etc.) - I host [.hi-orange[teaching materials]](https://github.com/searsjm) on GH. I even use it to host and maintain my [.hi-orange[website]](https://github.com/searsjm/searsjm.github.io) for free. .footnote[2: [.hi-orange[Democratic databases: Science on GitHub (Nature)]](https://www.nature.com/news/democratic-databases-science-on-github-1.20719) (Perkel, 2016).] --- # Using GitHub There are a couple of different main ways that we could use GitHub: -- 1\. Through [github.com](https://github.com/) Only * .hi-medgrn[Pros:] doesn't require any software/local repo copies * .hi-red[Cons:] much more time-intensive and not automated (the whole point of this thing!) -- 2\. Through the command line ([GitHub CLI](https://cli.github.com/)) * .hi-medgrn[Pros:] fully programmatic, requires no additional software * .hi-red[Cons:] fully programmatic! --- # Using GitHub There are a couple of different main ways that we could use GitHub: 3\. Integrated with RStudio * .hi-medgrn[Pros:] fully integrates RStudio projects * .hi-red[Cons:] limited to R projects, doesn't play as nicely with GitHub Classroom -- 4\. Through the GitHub Desktop App * .hi-medgrn[Pros:] Intuitive GUI, sync any kinds of files/projects * .hi-red[Cons:] requires an extra piece of software, [GitHub Desktop](https://desktop.github.com/download/) We're going to focus primarily on **4**, but the back of the deck will contain slides working through **3** if you want to experiment with that too. --- class: inverse, middle name: desktop # GitHub Desktop --- # Version Control with GitHub Desktop Although GitHub integration with RStudio has lots of functionality, there are times where we want to keep track of files and projects .hi-medgrn[outside of RStudio]. * For example, when you want version control of projects that .hi-blue[don't only use R] (or don't use it at all) This is where .hi-purple[GitHub Desktop] comes in. --- # Version Control with GitHub Desktop This next section is about learning the basic Git(Hub) commands and the recipe for successful version control with GitHub Desktop. I also want to bookmark a general point that we'll revisit many times during this course: - The tools that we're using all form part of a coherent .hi-medgrn[data science ecosystem]. - Greatly reduces the cognitive overhead ("aggregation") associated with traditional workflows, where you have to juggle multiple programs and languages at the same time. --- # Github Desktop Workflow With GitHub, if we were working 1. On our own 2. From a single computer we could just follow the below workflow: <br> .center[ <img src="images/workflow1.png" height=100%> ] Since we're collaborating with others/potentially across machines, we'll also add in a few more actions. --- # Main Git operations The first Git operation in the workflow is .hi-slate[Cloning] -- .center[ <img src="images/dolly.png" width="600"> ] --- # (Git) Cloning No, not *that* kind of cloning. .hi-slate[Cloning:] making a local copy of a .hi-blue[GitHub Repository] (repo for short). In order to clone a repo, we first need a repo to clone. Let's start by cloning our [Course repo!](https://github.com/afre-msu/AFRE-891-991-FS25) We can do this 1. From the repo page on github.com (direct link or SSH) 2. From GitHub Desktop directly --- # Course Repo Cloning (github.com) Let's start with the first approach. With GitHub Desktop installed/open, navigate to our [class web page repo](https://github.com/afre-msu/AFRE-891-991-FS25). .center[ <img src="images/clone_course_repo_1.png" height=100%> ] --- # Course Repo Cloning Click the green .hi-medgrn[Code] button and then "Open with GitHub Desktop" .center[ <img src="images/clone_course_repo_2.png" height=100%> ] --- # Course Repo Cloning Allow your web browser to open the link. This will redirect you to GitHub Desktop, automatically adding in the repo URL. Next, choose where you want the local copy of the repo file's saved and hit .hi-blue[Clone] .center[ <img src="images/clone_course_repo_3.png" width = "850"> ] --- # Course Repo Cloning Wait a little bit, and navigate to the local path you gave it. Voila! .center[ <img src="images/clone_course_repo_4.png" height=100%> ] --- # Creating a Repo While there are *tons* of awesome repositories on GitHub that we might want to clone and use as our starting point, other times we want to .hi-slate[create our own repo!] Let's see how to do this from github.com. Start by creating your own repo on GitHub — call it "test" — and clone it, this time using the HTTPS link.<sup>3</sup> .footnote[<sup>3</sup> It's easiest to start with HTTPS, but <a href="http://happygitwithr.com/ssh-keys.html#ssh-keys" target="_blank">SSH</a> is advised for more advanced users.] --- # Create a Repository on GitHub (Repo) <img src="images/create_repo1.png" height=100%> --- # Create a Repository on GitHub (Repo) .center[ <img src="images/create_repo2.png" width= "750"> ] --- # Create a Repository on GitHub (Repo) .center[ <img src="images/create_repo3.png" width= "850"> ] --- # Create a Repository on GitHub (Repo) .center[ <img src="images/create_repo4.png" height=100%> ] --- # Clone a Repository from GitHub (HTTPS) .center[ <img src="images/create_repo5.png" height=100%> ] --- # Clone a Repository from GitHub (HTTPS) .center[ <img src="images/create_repo6.png" height=100%> ] --- # Clone a Repository from GitHub (HTTPS) <br> <br> <br> .center[ <img src="images/create_repo7.png" height=100%> ] --- # Making Local Changes Let's start by making some changes to our local copies of the test repo files. Open up the README file. Add some text and save the README. .pull-left[ .center[ <img src="images/edit_readme_1.png" height=100%> ] ] .pull-right[ .center[ <img src="images/edit_readme_2.png" height=100%> ] ] --- # Making Local Changes Open up Github Desktop. Do you see any changes? .center[ <img src="images/stage_desktop.png" height=100%> ] --- # Stage Github Desktop detects local changes and automatically .hi-slate[stages] those changes * Tells Git that you edited the `README.md` file and documents *how* it changed from the upstream repo version .center[ <img src="images/stage_desktop.png" width="100%"> ] --- # Commit and Push For now, these changes are .hi-medgrn[local to our own repo copy]. .pull-left[ .center[ .hi-slate[Local System] <img src="images/edit_readme_2.png" height=100%> ] ] .pull-right[ .center[ .hi-slate[GitHub] <img src="images/edit_readme_3.png" height=100%> ] ] In order for them to .hi-blue[update the main repo itself], we need to do two actions: .hi-slate[Commit] and .hi-slate[Push]. --- # Commit Github Desktop automatically stages any changes it detects, so that's part of the trouble taken care of already! Now we need to .hi-slate[Commit:] tell Git that, yes, you are sure these staged changes should be part of the repo history. --- # Commit To .hi-slate[commit] in GitHub Desktop, we need to first give our commit a name and a description (bottom-left): .center[ <img src="images/commit_desktop.png" width="85%"> ] and then hit .hi-slate[Commit]. --- # Push Next we need to .hi-slate[Push:] send the local file changes/additions to update the repo on GitHub. In the upper-right corner you should now see the option to .hi-slate[Push origin]: .center[ <img src="images/push_desktop_1.png" height=100%> ] ] --- # Push Click it and wait. That's it! .center[ <img src="images/push_desktop_2.png" height=100%> ] --- # Changes Now Visible on GitHub .center[ <img src="images/push_desktop_3.png" height=100%> ] --- # Recap Here's a step-by-step summary of what we just did. - Cloned a repo from GitHub - Made same changes to a repo file and saved them locally. - .hi-blue[Staged] these local changes (okay, Github Desktop did that for us). - .hi-medgrn[Committed] these local changes to our Git history with a helpful message. - .hi-pink[Pushed] our changes to the GitHub repo. This completes the single collaborator/computer workflow. -- But what if we're in a multiple author and/or multiple system situation? .hi-slate[A:] we need to bring the last Git Action: .hi-purple[Pull]. --- # Pull Just like we .hi-pink[Pushed] our downstream changes *to* the GitHub repo, we can also .hi-purple[Pull] upstream changes *from* the repo * Overwrites our current version of files with the newer versions from the GitHub repo When working with others, it's .hi-slate[critical] to .hi-purple[Pull] updates from the GitHub repo just in case anyone else made changes too (not expected here, but good practice). Aside: Always pull from the upstream repo *before* you push any changes. Seriously, do this even on solo projects; making it a habit will save you headaches down the road. --- # Pull To set up a possible pull, let's make some .hi-medgrn[upstream changes]. On Github, click on the plus symbol to .hi-slate[add a new file]. .center[ <img src="images/edit_github_0.png" height=100%> ] --- # Pull .center[ <img src="images/edit_github_1.png" height=100%> ] --- # Pull .center[ <img src="images/edit_github_2.png" height=100%> ] --- # Pull .center[ <img src="images/edit_github_3.png" height=100%> ] --- # Pull Flip back over to GitHub Desktop. One of two things may have happened. 1. Your GitHub Desktop already detected the upstream changes .center[ <img src="images/pull_desktop_1.png" height=100%> ] If you just see .hi-slate[fetch origin], click it to refresh and retrieve the changes. --- # Pull Click the .hi-slate[pull origin] button to add the new file to your local clone * Same thing would have happened if someone had modified existing files .center[ <img src="images/pull_desktop_2.png" height=100%> ] --- # Why this Workflow: GitHub Creating the repo on GitHub first means that it will .hi-medgrn[always be "upstream"] of your (and any other) local copies. - In effect, this allows GitHub to act as the .hi-purple[central node] in the distributed VC network. - Especially valuable when you are collaborating on a project with others, but also has advantages when you are working alone. - If you would like to move an existing project to GitHub, my advice is still to .hi-medgrn[create an empty repo there first], clone it locally, and then copy all your files across. --- class: inverse, middle name: tips # Other Tips and Productivity Tools --- # Productivity Miscellanea What follows are miscellaneous things that I find .hi-purple[improve my productivity] * .hi-medgrn[Synced Cloud Storage] (SpartanDrive or Dropbox/Box) * .hi-blue[Overleaf] for LaTeX collaboration * .hi-pink[Connected Papers] for literature networks --- # Synced Cloud Storage and OneDrive .hi-medgrn[Synced Cloud Storage] is hugely beneficial if you work across .hi-pink[multiple computers] or .hi-blue[with many collaborators]. * Make sure each computer has the most up-to-date version of all your files * Renders flash drives almost entirely obselete! --- # Synced Cloud Storage and OneDrive One easy way to do this: .hi-purple[SpartanDrive/OneDrive] All faculty + staff get .hi-blue[5 TB of free storage] on [.hi-orange[SpartanDrive (MSU's version of OneDrive)]](https://tdx.msu.edu/TDClient/32/Portal/KB/ArticleDet?ID=1169#ANCHOR_Topic%20Two) .pull-left[ ## Pros * Free * Syncable desktop apps * Part of the MSU Office365 ecosystem ] .pull-right[ ## Cons * Part of the MSU Office365 ecosystem * Limited storage (5TB, 250gb max filesize) * Sometimes finicky ] --- # Synced Cloud Storage Alternatives to SpartanDrive: * .hi-blue[Dropbox]: 2GB free ($10/mo for 2TB) * .hi-pink[Box:] 10GB free ($10/mo for 100GB) -- Ultimate choice of platform may depend on coauthors + your current choice, but it's a good idea to .hi-purple[download the desktop app] to keep your files synced + backed up! * Also gives basic version control --- # Overleaf If you typeset using LaTeX, [.hi-orange[Overleaf]](https://www.overleaf.com/) streamlines access and collaboration .less-right[ .hi-blue[Free Version:] remotely host all files, access for you + 1 collaborator .hi-medgrn[Paid ($7.40/mo):] track changes + full document history, Git(hub) + Dropbox integration, up to 6 collaborators ] .more-left[ <img src="data:image/png;base64,#images/overleaf.png" width = "120%"/> ] --- # Connected Papers .more-left[ <img src="data:image/png;base64,#images/connectedpapers.png" width = "110%"/> ] .less-right[ * Graphical representation of paper networks * Visualize literature as directed graph * 5 free per month ([.hi-orange[unlimited for $6/mo]](https://www.connectedpapers.com/pricing)) ] --- # Productivity Miscellanea <br> <br> .font200[ .center[ .hi-dkgrn[Your Productivity Tips + Tricks?] ] ] --- class: inverse, middle name:gitr # Git(Hub) and RStudio --- # Git(Hub) Integration in RStudio One of the (many) features of RStudio is how well it integrates version control into your everyday workflow. - Even though Git is a completely separate program to R, they feel like part of the same "thing" in RStudio. - This next section is about learning the basic Git(Hub) commands and the recipe for successful project integration with RStudio. -- I also want to bookmark a general point that we'll revisit many times during this course: - The tools that we're using all form part of a coherent .hi-medgrn[data science ecosystem]. - Greatly reduces the cognitive overhead ("aggregation") associated with traditional workflows, where you have to juggle multiple programs and languages at the same time. --- # Link a GitHub Repo to an RStudio Project The starting point for our workflow is to link a GitHub repository (i.e. "repo") to an RStudio Project. Here are the steps we're going to follow: 1. Create the repo on GitHub and initialize with a README. 2. Copy the HTTPS/SSH link (the green "Clone or Download" button). 3. Open up RStudio. 4. Navigate to **File -> New Project -> Version Control -> Git**. 5. Paste your copied link into the "Repository URL:" box. 6. Choose the project path ("Create project as subdirectory of:") and click **Create Project**. Let's practice this using our existing test repo. Start by copying the repo link. --- # Create new R Project To start, we need to create an .hi-medgrn[R Project] * A project (i.e. folder) specific to a given task Open a new RStudio window and click .hi-slate["File > New Project"] --- # Version Control <img src="images/add_rstudio1.png" height=100%> --- # Choose Git <img src="images/add_rstudio2.png" height=100%> --- # Add Repo info <img src="images/add_rstudio3.png" height=100%> --- # Add Repo into RStudio <img src="images/add_rstudio4.png" height=100%> --- # Making Local Changes Look at the .hi-medgrn[top-right panel] in your RStudio IDE. Do you see the .hi-blue[Git] tab? - Click on it. - There should already be some files in there, which we'll ignore for the moment.<sup>4</sup> Now open up the README file (see the "Files" tab in the bottom-right panel). - Add some text and save the README. - Do you see any changes in the .hi-blue[Git] panel? Good. (Raise your hand if not.) .footnote[<sup>4</sup> They're important, but not for the purposes of this section.] --- # Stage and Commit .center[ <img src="images/stage_commit1.png"> ] --- # Stage and Commit .center[ <img src="images/stage_commit2.png"> ] --- # Stage and Commit .center[ <img src="images/stage_commit3.png"> ] --- # Pull .center[ <img src="images/pull.png"> ] --- # Push .center[ <img src="images/push.png"> ] --- # Sign RStudio into Github .center[ <img src="images/push_signin.png" height="550"> ] --- # Sign RStudio into Github .center[ <img src="images/push_signin2.png" height= "550"> ] --- # Changes Now Visible on Github .center[ <img src="images/github_changes.png" height=100%> ] --- # Changes Now Visible Locally Too .center[ <img src="images/local_changes.png" height=100%> ] --- # Recap Here's a step-by-step summary of what we just did. - Made same changes to a file and saved them locally. - .hi-blue[Staged] these local changes. - .hi-medgrn[Committed] these local changes to our Git history with a helpful message. - .hi-purple[Pulled] from the GitHub repo just in case anyone else made changes too (not expected here, but good practice). - .hi-pink[Pushed] our changes to the GitHub repo. PS — You were likely challenged for your GitHub credentials at some point. Learn how to cache these [here](https://happygitwithr.com/credential-caching.html). --- # Why this Workflow: RStudio .hi-blue[RStudio Projects] are great. - They interact seamlessly with Git(Hub), as we've just seen. - They also solve absolute vs. relative path problems, since the .Rproj file acts as an anchor point for all other files in the repo.<sup>5</sup> Note however that R Markdown files do the same thing re: paths *and* extend functionality to incorporating text, images, latex, and html. .footnote[<sup>5</sup> Calling files from their full `YourComputer/YourName/Documents/Special-Subfolder/etc` paths in your scripts is the enemy of reproducibility!] --- class: inverse, middle name: gitcred # Troubleshooting Git Credential Issues in RStudio --- # Troubleshooting Git Credential Issues in RStudio One reason I'm not focusing on the GitHub + RStudio integration is .hi-slate[credential errors.] For instance, do you get a password authentication error when trying to push to GitHub from RStudio? .center[ <img src = "images/gitcred_rstudio.png" width = "700" /src> ] --- # Troubleshooting Steps To begin, install the `usethis` package: ``` r pacman:p_load(usethis) # or: # install.packages("usethis") # followed by # library(usethis) ``` --- # Troubleshooting Steps Next, run the following: ``` r library("usethis") git_sitrep() ``` -- View the output: do you see any Xs or lines saying "lacks recommended scopes", "error", or "can't retrieve"? --- # Create a Personal Access Token (PAT) To fix these errors, we'll create a .hi-medgrn[Personal Access Token (PAT)] on GitHub. The below code will open a browser window - follow the steps to create a PAT ``` r usethis::create_github_token() ``` .center[ <img src = "images/gitcred_pat.png" width = "550" /src> ] --- # Add a PAT to RStudio Back in RStudio, run ``` r gitcreds::gitcreds_set() ``` Follow the prompts to add/replace existing credentials, pasting in the PAT. --- # Check PAT Once added, run `git_sitrep` one last time to verify. If all looks good, try pushing again! .center[ <img src = "images/sitrep.png" width = "550" /src> ] --- # Table of Contents 1. [Prologue](#prologue) 1. [R Markdown](#markdown) 1. [Version Control](#control) 1. [GitHub Desktop](#desktop) 1. [Other Tips and Productivity Tools](#tips) 1. [Not Covered: Git(Hub) + RStudio](#gitr) 1. [Not Covered: Troubleshooting Git Credential Issues in RStudio](#gitcred)